EFFECTS OF TIME LIMITS ON TEST-TAKING BEHAVIOR


Rate and accuracy of response are important variables in the study
of ability test performance. Response rate is reflected in several
kinds of measurements, including average response latency, the time
taken to complete a test, the number of item responses that are made
in a fixed period of time, and the average number of item responses
per unit of time. Accuracy of response refers to measurements such
as the number of items answered correctly and the proportion of items
answered correctly by a given individual. As the terms are used here,
rate and accuracy refer to response characteristics; they are
characteristics of test-taking behavior. These terms should not be
confused with the terms speed and power, which refer to test
characteristics.


The relationship between rate and accuracy of response has been
studied actively since the turn of the century. Much of the research
has compared test scores obtained under time-limit versus no-time-limit
conditions. Previous reviews of the relevant literature have been
presented by Highsmith (1925), McFarland (1928), Tryon and Jones (1933),
Bennett (1941), Himmelweit (1946), and Nummenmaa (1960). An
extensive bibliography has been provided by Morrison (1960).


Miller (1974) has identified three major viewpoints with regard
to rate and accuracy in ability test performance: 1) rate and
accuracy are indicative of the same underlying ability; 2) rate
and accuracy represent separate abilities; and 3) rate and accuracy
depend upon personality and motivational factors as well as upon
ability. The first viewpoint was typified by the work of Spearman
(1904, 1927) and a number of early researchers at the Harvard
Psychological Laboratory (McFarland, 1930; Peak and Boring, 1926). The
second viewpoint derives from the research of Baxter (1941), Davidson
and Carroll (1945), and Horn and Bramble (1967). Thurstone (1937) set
the stage for research motivated by the third viewpoint, dealing with
time limits and test bias. Much of the psychometric research
with time-limit versus no-time-limit testing conditions has been
motivated by one or another of these viewpoints.

A distinction between rate and accuracy of response becomes
particularly important when tests are administered under time-limit
conditions. An individual's time-limit test score, or the number
of items answered correctly within a time period, is a function of
both rate and accuracy of response. If the testee maintains a constant
level of accuracy but increases his/her rate, his/her time-limit test
score will increase. If the testee maintains a constant rate of response
but increases his/her accuracy, his/her time-limit test score will
increase. Similarly, decreases in rate or accuracy will cause
corresponding decreases in time-limit test scores. Unfortunately,
many of the previous studies comparing time-limit and no-time-limit
testing conditions have used the number of correctly answered items
as the major dependent variable. A better understanding of
test-taking behavior under time-limit and no-time-limit conditions
would be obtained by using separate measurements of rate and accuracy
of response.
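The dependence of a time-limit score on both rate and accuracy can be
illustrated with a minimal sketch. The function name and the two
hypothetical testees are assumptions made for illustration; only the
25-minute limit and the 250-item test length come from Study 1.

```python
def time_limit_score(rate_per_min, accuracy, minutes, n_items):
    """Expected number-correct score under a time limit.

    Illustrative model only: items attempted = rate x time (capped at the
    test length), and score = items attempted x proportion correct.
    """
    attempted = min(rate_per_min * minutes, n_items)
    return attempted * accuracy

# Two hypothetical testees on a 250-item test with a 25-minute limit.
fast_less_accurate = time_limit_score(10.0, 0.40, minutes=25, n_items=250)
slow_more_accurate = time_limit_score(5.0, 0.80, minutes=25, n_items=250)
```

Under these assumed values both testees obtain the same score, which is
why the number-correct score alone cannot separate rate from accuracy.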


Separating the effects of rate and accuracy in ability test
performance has other advantages as well. Since different
individuals can obtain the same time-limit test score through different
combinations of rate and accuracy, time-limit test scores may be
factorially complex. Because of their complexity, they may be
more difficult to interpret and less useful in predicting external
criteria. It is often argued that time-limit testing procedures
penalize the slow but accurate responder. By obtaining separate
rate and accuracy measurements, the slow but accurate responder
can be identified and his/her test-taking behavior studied. Rate and
accuracy scores could also be used as separate measurements of an
individual's ability level.


By using computerized test administration it is possible to make
accurate measurements of item response latencies. Such information
might have diagnostic and predictive utility, especially in situations
where individuals have different rates of response but similar levels
of accuracy. Item response latencies can be utilized in the study of
test-taking styles. Response rate or test-taking styles could also
prove to be important moderator or predictor variables in prediction
studies.

Within the context of the research on adaptive ability testing 
reported in this series, the present research illustrates an additional 
application of on-line computers in psychological measurement. 
Previous research into rate and accuracy in ability test performance 
was limited to paper-and-pencil test administration. The present 
research utilizes computerized test administration to make accurate 
measurements of item response latencies and response rates. The three 
studies reported below utilize separate measurements of rate and 
accuracy of performance. These measurements of rate and accuracy are
studied under both time-limit and no-time-limit conditions of ability
test administration.


General Methodology 


Each of the three studies reported below was addressed to the
same basic question: What are the differences in test-taking
behavior under time-limit versus no-time-limit conditions? Test-taking
behavior was operationalized by measurements of both rate and accuracy
of response and by an intra-individual analysis of response latencies.
Different experimental designs were used in each of the studies, but
in many respects the studies employed the same general methodology.

Testees

The testees for the three studies were undergraduate student
volunteers from the University of Minnesota. Prior to experimental
testing the students were informed that they would be taking a
multiple-choice test of verbal ability and that they would receive a
penny for every correct answer that was given. There were 72 students
in Study 1, 30 in Study 2, and 30 in Study 3.


Ability Tests

The tests consisted of multiple-choice vocabulary items. A
complete listing of these items is given in Miller (1974); McBride
and Weiss (1974) describe the calibration of the item pool. Study 1
utilized an untimed pretest consisting of 100 items and an experimental
test consisting of 250 items. Studies 2 and 3 utilized two experimental
tests consisting of 175 items each.


The item order within the experimental tests was determined as
follows. First, the test items were grouped according to difficulty
level. The p-value, or the proportion of individuals in the norm
group answering an item correctly, was used as the index of item
difficulty. The tests were composed of blocks of items, and each
block of items within a test contained one item chosen at random
from each difficulty level. The order of items within blocks was
randomized in order to avoid introducing any cyclical effects. This
procedure of item arrangement ensured that the average item difficulty
of each block of items would be approximately the same throughout each
of the tests. A more detailed description of this method for arranging
test items is given by Miller (1974).
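The block arrangement described above can be sketched as follows. This
is a minimal illustration rather than the procedure documented in
Miller (1974): the function name, the item labels, and the choice of
five difficulty levels with four items each are all hypothetical.

```python
import random

def arrange_items(items_by_level, n_blocks, seed=0):
    """Return a test order built from blocks with one item per difficulty level.

    items_by_level maps a difficulty-level label to the item ids at that
    level; every level must hold at least n_blocks items.
    """
    rng = random.Random(seed)
    # Shuffle each level once so items are drawn at random without reuse.
    pools = {level: rng.sample(items, len(items))
             for level, items in items_by_level.items()}
    order = []
    for b in range(n_blocks):
        block = [pools[level][b] for level in pools]  # one item per level
        rng.shuffle(block)                            # random order within block
        order.extend(block)
    return order

# Hypothetical pool: 5 difficulty levels with 4 items each.
pool = {level: [f"L{level}-item{i}" for i in range(4)] for level in range(1, 6)}
test_order = arrange_items(pool, n_blocks=4)
```

Because every block draws once from every level, the mean difficulty
of each block stays close to the mean difficulty of the whole test.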


Administrative Conditions

Each of the three studies utilized a Control Data Corporation 3200
digital computer to provide on-line control of the experiment. Test
instructions and experimental test items were presented on cathode-ray
terminals (CRTs) equipped with a typewriter keyboard for the recording
of responses. The system could be used to administer tests to as
many as six subjects at a time. Complete items were written on the
CRTs instantaneously. There was virtually no delay between an
individual's answering one item and the presentation of the next item.
Item response latencies were recorded in milliseconds, although
these measurements were accurate only to a tenth of a second when
testing more than one individual at a time. Each item response was
examined for admissibility, and skipping items was not allowed.


Each of the three studies employed time-limit and no-time-limit
conditions, but the assignment of testees to experimental conditions
varied from one study to the next. The nature of the testing
conditions was described to the students in a series of instructional
frames presented by the computer (see Miller, 1974, pp. 233-256). To
ensure that each testee was aware of the testing conditions, he/she was
required to respond correctly to a series of computer-administered
multiple-choice questions about the test instructions. Under a
time-limit condition each item was presented with the item number in
the upper right-hand corner of the screen and with the time remaining
(in minutes and seconds) at the bottom of the screen. Under a
no-time-limit condition each item was presented with the item number
alone.


Dependent Variables

Rate and accuracy of response constituted the primary dependent
variables of interest. An individual's response rate was defined as
the average number of item responses per minute. An individual's
accuracy of response was defined as the proportion of items answered
correctly out of those attempted. These two dependent variables were
analyzed in separate analyses of variance, using experimental designs
which varied across studies.

Item response time or item response latency is the length of time
between the presentation of the test item and the testee's response to
that item. Response latency is determined by three components: 1)
the time it takes to read the item; 2) the time it takes to arrive
at a solution to the item; and 3) the time it takes to record one's
solution to the item. These three components--reading time, solution
time, and recording time--will vary between individuals and between
items.
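The definitions of response rate and response accuracy given above can
be written out as a short sketch. The record layout and function names
are hypothetical, and total testing time is assumed here to equal the
sum of the item latencies, which is reasonable only because the report
states that there was virtually no delay between items.

```python
from dataclasses import dataclass

@dataclass
class Response:
    latency_s: float  # seconds from item presentation to recorded response
    correct: bool

def response_rate(responses):
    """Average number of item responses per minute."""
    total_minutes = sum(r.latency_s for r in responses) / 60.0
    return len(responses) / total_minutes

def response_accuracy(responses):
    """Proportion of attempted items answered correctly."""
    return sum(r.correct for r in responses) / len(responses)

# Hypothetical record: four items answered in 30 seconds, three correctly.
record = [Response(6.0, True), Response(9.0, False),
          Response(12.0, True), Response(3.0, True)]
rate = response_rate(record)          # responses per minute
accuracy = response_accuracy(record)  # proportion correct
```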

By using multiple-choice items of similar length, inter-item
differences associated with reading time can be reduced. Consequently,
each item used in these studies consisted of a single stimulus word
and five one-word alternatives. The task in responding to each item
was the same: the testee was instructed to find the alternative
closest in meaning to the stimulus word. By using items with a
similar format and by administering items by computer, inter-item
differences associated with recording time were reduced. Thus, by
standardizing conditions relating to reading time and recording time,
the major influence on a testee's rate of response was solution time.


Study 1

This study was designed to investigate individual differences in
test-taking behavior under time-limit and no-time-limit conditions.
The study employed a randomized block design with testees blocked
into high- and low-ability groups according to their performance on
a pretest. Such a design permits analysis of the extent to which
high- and low-ability groups perform differently under time-limit
and no-time-limit testing conditions.

Method


Procedure. In this study students first were administered a pretest
consisting of 100 multiple-choice vocabulary items. This test was used
to block the testees into two ability groups. Testees scoring above the
median on the pretest were assigned to the high-ability group, and
those scoring below the median on the pretest were assigned to the
low-ability group. Testees from each ability group then were assigned
randomly to one of the administrative conditions.
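A median split followed by random assignment within ability groups
might look like the sketch below. The function name, the testee
labels, and the pretest scores are hypothetical, and equal-sized cells
are assumed. The report does not say how scores exactly at the median
were handled; the sketch places them in the low-ability group, which
is an arbitrary choice.

```python
import random
from statistics import median

def block_and_assign(pretest_scores, seed=0):
    """Median-split testees on the pretest, then randomize within groups."""
    rng = random.Random(seed)
    cut = median(pretest_scores.values())
    groups = {"high": [], "low": []}
    for testee, score in pretest_scores.items():
        # Ties at the median go to the low group (assumption, see lead-in).
        groups["high" if score > cut else "low"].append(testee)
    assignment = {}
    for ability, members in groups.items():
        rng.shuffle(members)          # random order within the ability group
        half = len(members) // 2
        for i, testee in enumerate(members):
            condition = "time-limit" if i < half else "no-time-limit"
            assignment[testee] = (ability, condition)
    return assignment

# Hypothetical pretest scores for eight testees.
scores = {"t1": 41, "t2": 55, "t3": 62, "t4": 70,
          "t5": 38, "t6": 81, "t7": 49, "t8": 66}
cells = block_and_assign(scores)
```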


Under the time-limit condition testees were told that there was
a 25-minute time limit and that the test contained 250 items. They
were told that they would have to answer an average of 10 items per
minute in order to finish the test. Under the no-time-limit condition
the testees were told that there was no time limit and that the test
contained 250 items. Students in both groups were told they would
receive one cent for each item answered correctly.


Data analyses. The observations for rate and accuracy of response
were analyzed in separate analyses of variance using 2 x 2 factorial
designs. The first factor was the testing condition, time-limit versus
no-time-limit. The second factor was ability level, high-ability group
versus low-ability group. Pearson product-moment correlations were
computed between selected pairs of the following variables: testing
condition (0=time-limit, 1=no-time-limit), pretest score, response
rate, and response accuracy.


The relationship between item difficulty (p-value) and mean item
response latency was studied for each of the four cells in the design.
For each of the 250 items the item p-value was obtained from the
calibration study data (McBride & Weiss, 1974), and the mean item
response latency was obtained by averaging the response latencies of
all testees who attempted the item.


In addition to these analyses, a test response record was obtained
for each individual in the study. This response record was a
time-series plot of the number of responses per minute during each
minute of the testing session.
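The counts behind such a response record can be derived from item
latencies, as in this minimal sketch. The function name and the
latencies are hypothetical, and elapsed time is assumed to be the
running sum of latencies.

```python
def response_record(latencies_s):
    """Number of responses completed in each successive minute of testing."""
    counts = []
    elapsed = 0.0
    for latency in latencies_s:
        elapsed += latency                # time at which this response was made
        minute = int(elapsed // 60)       # 0 = first minute, 1 = second, ...
        while len(counts) <= minute:
            counts.append(0)              # pad minutes with no responses
        counts[minute] += 1
    return counts

# Hypothetical latencies (seconds) for seven consecutive items.
record = response_record([20, 20, 15, 30, 40, 10, 10])
```

Plotting these counts against the minute index gives the time-series
record described above.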

Results

Response rate. Table 1 shows the means and standard deviations for
response rate (number of responses per minute) in the four experimental
groups. Also shown are the results of the analysis of variance using
testing condition and ability level as the independent variables and
response rate as the dependent variable. The mean response rate under
the time-limit condition (8.56) was significantly higher (p<.001) than
the mean response rate under the no-time-limit condition (5.77). The
mean response rate for the high-ability group (7.61) was higher (p<.10)
than the mean response rate for the low-ability group (6.73). There
was no significant interaction between testing conditions and ability
levels. The point-biserial correlation describing the degree of
relationship between testing conditions and response rate was -.59.

                              Table 1
        Means, Standard Deviations and Analysis of Variance
               for the Number of Responses per Minute

Testing Condition
and Ability Level                         N     Mean    S.D.
Time-Limit Condition
    Total group                          36     8.56    2.20
    High-ability                         18     9.00    1.68
    Low-ability                          18     8.12    2.59
No-Time-Limit Condition
    Total group                          36     5.77    1.61
    High-ability                         18     6.21    1.30
    Low-ability                          18     5.33    1.80
Ability Group (Conditions Combined)
    High-ability                         36     7.61    2.05
    Low-ability                          36     6.73    2.63

                        Analysis of Variance
Source of Variation                      df      MS       F       p
Testing Condition                         1    140.11   38.75   <.001
Ability Level                             1     13.87    3.84    .051
Testing Condition x Ability Level         1       .00     .00    .995
Error                                    68      3.62

Response accuracy. Table 2 shows the means and standard deviations
for response accuracy in the four experimental groups. Also shown are
the results of the analysis of variance for response accuracy. The
difference between mean response accuracies under time-limit versus
no-time-limit conditions was not statistically significant. The mean
accuracy score for the high-ability group (.60) was significantly higher
(p<.001) than the mean accuracy score for the low-ability group (.36).
This result was expected, however, since scores on a vocabulary pretest
were used to block individuals into ability levels. The correlation
between pretest scores and accuracy scores was .95. There was no
significant interaction between testing conditions and ability levels in
determining response accuracy. The correlation between response rate and
response accuracy was .17 for individuals under a time-limit condition
and .18 for individuals under a no-time-limit condition. Neither of
these correlations was significantly different from zero.
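Because the Study 1 design is balanced (18 testees per cell), the
effect mean squares for response rate reported in Table 1 can be
approximately recovered from the four cell means alone. The sketch
below is an illustration, not the authors' computation; the function
name is hypothetical, and the cell means and the error mean square
(3.62) are taken from Table 1.

```python
def balanced_2x2_effects(cell_means, n_per_cell):
    """Sums of squares for a balanced two-factor design from its cell means.

    cell_means maps (condition, ability) to the cell mean. Each effect has
    1 degree of freedom in a 2 x 2 design, so each SS is also the MS.
    """
    conditions = sorted({c for c, _ in cell_means})
    abilities = sorted({a for _, a in cell_means})
    grand = sum(cell_means.values()) / len(cell_means)
    cond_mean = {c: sum(cell_means[(c, a)] for a in abilities) / len(abilities)
                 for c in conditions}
    abil_mean = {a: sum(cell_means[(c, a)] for c in conditions) / len(conditions)
                 for a in abilities}
    ss_cond = n_per_cell * len(abilities) * sum(
        (m - grand) ** 2 for m in cond_mean.values())
    ss_abil = n_per_cell * len(conditions) * sum(
        (m - grand) ** 2 for m in abil_mean.values())
    ss_inter = n_per_cell * sum(
        (cell_means[(c, a)] - cond_mean[c] - abil_mean[a] + grand) ** 2
        for c in conditions for a in abilities)
    return ss_cond, ss_abil, ss_inter

# Cell means for responses per minute, from Table 1.
table1_means = {("time-limit", "high"): 9.00, ("time-limit", "low"): 8.12,
                ("no-time-limit", "high"): 6.21, ("no-time-limit", "low"): 5.33}
ms_cond, ms_abil, ms_inter = balanced_2x2_effects(table1_means, n_per_cell=18)
f_cond = ms_cond / 3.62   # error mean square from Table 1
f_abil = ms_abil / 3.62
```

The results (about 140.1 for testing condition and about 13.9 for
ability level) differ slightly from the tabled values because the
published cell means are rounded to two decimals.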


                              Table 2
        Means, Standard Deviations and Analysis of Variance
             for the Proportion of Items Answered Correctly

Testing Condition
and Ability Level                         N     Mean    S.D.
Time-Limit Condition
    Total group                          36      .47     .16
    High-ability                         18      .59     .14
    Low-ability                          18      .36     .06
No-Time-Limit Condition
    Total group                          36      .48     .16
    High-ability                         18      .61     .10
    Low-ability                          18  [illegible] .07
Ability Group (Conditions Combined)
    High-ability                         36      .60     .12
    Low-ability                          36      .36     .07

                        Analysis of Variance
Source of Variation                      df      MS       F       p
Testing Condition                         1     .0033     .36    .555
Ability Level                             1    1.1225  122.69   <.001
Testing Condition x Ability Level         1     .0070     .77    .611
Error                                    68     .0091

Item response latencies. Figure 1a shows the bivariate distribution
of item difficulties (p-values) and mean item response latencies (in
seconds) for the high-ability group under the time-limit condition. The
regression line for predicting mean latencies from item difficulties is
plotted on this figure. For this group the correlation between item
difficulty and mean item response latency was -.36 (p<.01). Figure 1b
shows the bivariate distribution of item difficulties (p-values) and mean
item response latencies (in seconds) for the high-ability group under the
no-time-limit condition, and the regression line for predicting
mean latencies from item difficulties.

                              Figure 1
       Mean Response Latency as a Function of Item Difficulty
                     for the High-Ability Group
                         (a)            (b)

For this group the correlation between item difficulty and mean item
response latency was -.56 (p<.01). Both correlations were negative:
longer response latencies were associated with the more difficult
items (items with lower p-values). The difference in the magnitude of
the correlations, which was statistically significant (p<.05), reflects
the fact that testees spend proportionately less time in responding to
the more difficult items under a time-limit condition than under a
no-time-limit condition. A similar pattern of results was obtained for
the low-ability group (see Miller, 1974, pp. 68-71).
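One common way to compare two correlations is Fisher's r-to-z
transformation, sketched below. The report does not say which test was
used, so this is an assumption and not a reconstruction of the
authors' analysis. The test also treats the two correlations as
independent, which is only approximately true here because both are
computed over the same 250 items.

```python
import math

def fisher_z_difference(r1, n1, r2, n2):
    """Two-tailed z test for the difference between two correlations,
    treating them as coming from independent samples."""
    z1, z2 = math.atanh(r1), math.atanh(r2)          # Fisher r-to-z
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))  # standard error of z1 - z2
    z = (z1 - z2) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))           # two-tailed normal p
    return z, p

# Correlations reported for the high-ability group, each over 250 items.
z_stat, p_value = fisher_z_difference(-0.36, 250, -0.56, 250)
```

Under these assumptions the difference falls below the .05 level,
which is in line with the significance statement in the text.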


                              Figure 2
               Sample Response Records Resulting from
                     Time-Limit Administration

(Two response records; vertical axis: responses per minute;
horizontal axis: time in minutes.)


Test response records. The test response records for all of the
individuals in this study are shown in Miller (1974); only four are
presented here for illustrative purposes. Figure 2 shows sample test
response records for two testees who received the time-limit condition.
Both testees whose response records are shown in Figure 2 completed
all 250 items in the test within the time limit. Both testees received
approximately the same number-correct score (104 for the upper response
record and 107 for the lower), but their "styles" of test-taking
behavior were quite different. The upper response record is typical
of test-taking behavior under time-limit conditions. Most testees show
a generally increasing response rate under these conditions, implying
an adaptation effect or a motivational effect due to the time limit.
However, as the lower part of Figure 2 shows, some testees obtained
similar scores even though they worked at a consistent speed, without
evidence of adaptation or motivational effects.


                              Figure 3
               Sample Response Records Resulting from
                    No-Time-Limit Administration

Figure 3 shows two sample response records under the no-time-limit
testing condition. Both testees obtained approximately the same
number-correct score (143 for the upper response record and 150 for the
lower). The upper response record is characteristic of response rate
behavior under no-time-limit conditions. This testee answered a
relatively constant number of items per minute throughout the test and
took 70 minutes to complete the test. On the other hand, the test
response record shown in the lower part of Figure 3 shows the
characteristic adaptation or motivational pattern of the time-limit
records, even though the test was administered under no-time-limit
conditions.

Some of the response records show even more unusual patterns of
response rates (see Miller, 1974, pp. 167-202). Thus, there is a wide
range of individual differences in response rate patterns both between
and within time-limit and no-time-limit testing conditions.


Conclusions. The results of this experimental study indicated
that higher response rates are to be expected under time-limit
conditions. There were no significant differences between response
accuracies for time-limit versus no-time-limit conditions. The lack of a
significant interaction between testing conditions and ability levels
for either response rate or response accuracy indicates that testees
from different ability levels show similar patterns of response in
adapting to time-limit and no-time-limit conditions. By examining
response latency data as a function of item difficulties under the two
administration conditions, one can see what these response patterns
entail. There was some evidence in these data that under a time-limit
condition subjects spend proportionately less time in responding to the
more difficult items than they do under a no-time-limit condition.
Students maintain their level of accuracy under time-limit conditions
although they increase their response rate. They do this by spending
less time on the more difficult items than they do under no-time-limit
conditions. The individual response records showed wide individual
differences in behavior, implying different test-taking "styles." These
different styles, if reliable within individuals, could be important
moderator variables whose use might improve predictive validities
based on ability test scores.


Study 2

This study was designed to investigate intra-individual variability
in rate and accuracy of response. Study 1 showed different patterns of
test-taking behavior between time-limit and no-time-limit conditions
using a between-subjects design. This study used a within-subjects
design.

Method

Procedure. Each student received two 175-item tests in succession,
with one test administered under a time-limit condition and the other
under a no-time-limit condition. Each testee was randomly assigned to
one order of administrative conditions. The design was counterbalanced:
15 testees received the time-limit condition first, and 15 received the
no-time-limit condition first. Under the time-limit condition, testees
were told that the test contained more than 150 items and that there
was a 15-minute time limit. They were told that they would have to
answer more than 10 items per minute in order to finish the test.
Under the no-time-limit condition, testees were told that there was no
time limit, but they were not told the number of items in the test.
The testing sessions were terminated at the end of 15 minutes regardless
of the testing condition. Again, each testee was paid one cent for each
correct answer.
Data analyses. Repeated measurements analyses of variance were
employed. Testing condition represented a within-subjects factor with
two levels: time-limit condition versus no-time-limit condition. The
order of testing conditions represented a between-subjects factor with
two levels: time-limit condition first versus no-time-limit condition
first. Pearson product-moment correlations were computed between selected
pairs of the following variables: time-limit response rate, time-limit
response accuracy, no-time-limit response rate, and no-time-limit response
accuracy. The mean response latency for correctly answered items was
compared to the mean response latency for incorrectly answered items.
These mean latencies were examined for both time-limit and no-time-limit
conditions.
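The comparison of mean latencies for correct and incorrect responses
can be sketched for a single testee and condition as follows. The
function name and the latencies are hypothetical.

```python
from statistics import mean

def mean_latency_by_outcome(responses):
    """Mean latency (seconds) of correct and of incorrect responses.

    responses is a list of (latency_seconds, answered_correctly) pairs
    containing at least one correct and one incorrect response.
    """
    correct = [latency for latency, ok in responses if ok]
    incorrect = [latency for latency, ok in responses if not ok]
    return mean(correct), mean(incorrect)

# Hypothetical responses from one testee under one testing condition.
responses = [(6.0, True), (8.0, True), (12.0, False), (10.0, False)]
mean_correct, mean_incorrect = mean_latency_by_outcome(responses)
```

Computing the pair once per testee and condition yields the values
that are then averaged within groups.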
                              Table 3
        Means, Standard Deviations and Analysis of Variance
                 for Number of Responses per Minute

Order and Type of
Testing Condition                             Mean         S.D.
Time-Limit Condition First (N=15)
    Both conditions combined              [illegible]  [illegible]
    Time-limit                            [illegible]  [illegible]
    No-time-limit                         [illegible]  [illegible]
No-Time-Limit Condition First (N=15)
    Both conditions combined              [illegible]  [illegible]
    Time-limit                            [illegible]  [illegible]
    No-time-limit                         [illegible]  [illegible]
Testing Condition (Orders Combined)
    Time-limit condition                  [illegible]  [illegible]
    No-time-limit condition               [illegible]  [illegible]

                        Analysis of Variance
Source of Variation                  df      MS         F          p
Between Subjects
    Order of testing conditions       1     8.48   [illegible] [illegible]
    Subjects within groups           28    10.72
Within Subjects
    Testing condition                 1    99.85     138.86      .001
    Order x testing condition         1    45.31      63.01      .001
    Testing condition x
      subjects within groups         28      .72
Results

Response rate. Table 3 shows the group means and standard deviations
for response rate under the time-limit and no-time-limit conditions. Also
shown are the results of the repeated measurements analysis of variance.
Regardless of the order of testing conditions within subjects, the mean
response rate under the time-limit condition was higher than the
response rate under the no-time-limit condition (p<.001).

The order of administration of testing conditions did not have a
significant effect upon response rate by itself, but the interaction
between the orders of administration and the testing conditions was
significant (p<.001). Figure 4 shows the pattern of this interaction. When
the no-time-limit condition was first, imposition of time limits led to a
large increase in mean number of responses per minute. When the
time-limit condition was administered first, there was virtually no
decrease in response rate under the no-time-limit condition. For the
testees receiving the time-limit condition first, the correlation between
time-limit and no-time-limit response rates was .90. For those receiving
the no-time-limit condition first, the correlation between time-limit and
no-time-limit response rates was .88.


                              Figure 4
        Group Means for the Number of Responses per Minute,
           Testing Condition, and Order of Administration

(Vertical axis: mean number of responses per minute; horizontal axis:
time-limit condition, no-time-limit condition; separate lines for
time-limit condition first and no-time-limit condition first.)


Response accuracy. Table 4 shows the group means and standard
deviations for response accuracy under the time-limit and no-time-limit
conditions. Also shown are the results of the repeated measurements
analysis of variance. There were no significant main effects or
interactions. For the testees receiving the time-limit condition first,
the correlation between time-limit and no-time-limit response accuracies
was .98. For those receiving the no-time-limit condition first, the
correlation between time-limit and no-time-limit response accuracies was
.86. Combining the data from both orders of administration, time-limit
response rates correlated .08 with time-limit response accuracies, and
no-time-limit response rates correlated .31 with no-time-limit response
accuracies. While the correlation of .08 under time-limit conditions was
not significantly different from zero, the correlation of .31 under
no-time-limit conditions was significant at the .05 level. Thus, under
no-time-limit conditions there was a tendency for higher scoring testees
to answer test items more quickly.
                              Table 4
        Means, Standard Deviations and Analysis of Variance
              for Proportion of Items Answered Correctly

Order and Type of
Testing Condition                             Mean         S.D.
Time-Limit Condition First (N=15)
    Both conditions combined              [illegible]      .17
    Time-limit                            [illegible]      .16
    No-time-limit                             .54          .17
No-Time-Limit Condition First (N=15)
    Both conditions combined                  .46          .14
    Time-limit                            [illegible]      .15
    No-time-limit                             .45          .13
Testing Condition (Orders Combined)
    Time-limit condition                      .50          .16
    No-time-limit condition                   .50          .16

                        Analysis of Variance
Source of Variation                  df      MS        F        p
Between Subjects
    Order of testing conditions       1    .1118     2.53     .120
    Subjects within groups           28    .0443
Within Subjects
    Testing condition                 1    .0007      .41     .539
    Order x testing condition         1    .0009      .49     .504
    Testing condition x
      subjects within groups         28    .0018
Item response latencies. Table 5 shows the group means and standard
deviations separately for the average response latency for correctly
answered items and for incorrectly answered items under the time-limit
and no-time-limit conditions. These mean latencies are plotted in Figure
5.

                              Table 5
         Means and Standard Deviations for Average Latency
            of Correct and Incorrect Responses in Seconds

                                     Correct Responses     Incorrect Responses
Order and Testing Condition          Mean        S.D.      Mean        S.D.
Time-Limit Condition First (N=15)
    Both conditions combined         7.67        3.18      10.75       4.78
    Time-limit                       7.25        2.95       9.59       3.39
    No-time-limit                    8.09    [illegible]   11.90       5.73
No-Time-Limit Condition First (N=15)
    Both conditions combined     [illegible] [illegible]   11.80       6.19
    Time-limit                   [illegible] [illegible]    7.76       2.62
    No-time-limit                [illegible] [illegible]   15.83       6.10
Testing Condition (Orders Combined)
    Time-limit condition         [illegible]     2.40       8.68   [illegible]
    No-time-limit condition      [illegible] [illegible]   13.87   [illegible]

Regardless of the order of administration of testing conditions,
mean latencies for correct responses were shorter under the time-limit
condition than under the no-time-limit condition. Mean latencies for
incorrect responses also were shorter under the time-limit condition than
under the no-time-limit condition. In general, the mean latencies for
correct responses were shorter than the mean latencies for incorrect
responses. It is interesting to note, however, that the differences in
mean latencies between correct and incorrect responses were greater under
the no-time-limit condition. Under a time-limit condition subjects spent
proportionately less time in responding to items which they answered
incorrectly.

                              Figure 5
      Mean Latencies for Correct and Incorrect Responses Plotted
     for the Two Testing Conditions and Orders of Administration

(Two panels: time-limit condition first; no-time-limit condition first.
Vertical axis: mean latency in seconds; horizontal axis: time-limit
condition, no-time-limit condition.)

Conclusions. Generally speaking, the results of Study 2 were in
agreement with the results of Study 1, although the design of the two
studies differed. Higher response rates were observed in this study
under time-limit conditions, and the highest response rates were
observed when the time-limit condition followed the no-time-limit
condition. As in Study 1, mean response accuracy was not affected by the
imposition of time limits, but the amount of time spent by testees on
items answered correctly or incorrectly differed under the two
administrative conditions. The fact that response rate and accuracy are
different variables was illustrated by the near-zero correlation between
the two variables under time-limit conditions and by a low positive
correlation under no-time-limit conditions.


Study 3 
This study was similar to Study 2 in that it employed a within-subjects
design; it examined intra-individual variability. It was
uniquely designed to investigate learning or practice effects that
could result when an individual moves from one time-limit testing
session to the next or from one no-time-limit testing session to another.
Such learning effects might include different test-taking strategies.


Method 


Procedure. The test materials, instructions, and incentives were
the same as those used in Study 2. In this study, however, each testee
received the same testing condition (i.e., time-limit or no-time-limit)
twice. There were 15 testees in each of the two experimental groups.


Data analyses. Repeated measurements analyses of variance were
employed. Testing condition represented a between-subjects factor with two
levels: time-limit condition versus no-time-limit condition. Testing
session represented a within-subjects factor with two levels: Session 1
versus Session 2. Dependent variables were response rate and response
accuracy. Pearson product-moment correlations were computed between
selected pairs of the following variables: Session 1 response rate,
Session 1 response accuracy, Session 2 response rate, and Session 2 response
accuracy.
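
The design described above (one between-subjects factor, one within-subjects factor) can be illustrated with a minimal sketch. This is not the authors' program: the function name `mixed_anova`, the data layout, and the scores are hypothetical, and equal group sizes are assumed. It uses the standard partition in which the condition effect is tested against subjects within groups, and the session and interaction effects are tested against the session x subjects-within-groups term.

```python
from statistics import mean


def mixed_anova(groups):
    """F ratios for a mixed design: groups (between) x sessions (within).

    `groups` is a list of groups (testing conditions); each group is a list
    of subjects; each subject is a list of scores, one per session.
    Equal group sizes are assumed.
    """
    G, n, S = len(groups), len(groups[0]), len(groups[0][0])
    grand = mean(y for g in groups for subj in g for y in subj)
    group_m = [mean(y for subj in g for y in subj) for g in groups]
    sess_m = [mean(subj[s] for g in groups for subj in g) for s in range(S)]
    cell_m = [[mean(subj[s] for subj in g) for s in range(S)] for g in groups]
    subj_m = [[mean(subj) for subj in g] for g in groups]

    # Sums of squares for each source of variation.
    ss_cond = n * S * sum((m - grand) ** 2 for m in group_m)
    ss_subj = S * sum((subj_m[g][i] - group_m[g]) ** 2
                      for g in range(G) for i in range(n))
    ss_sess = G * n * sum((m - grand) ** 2 for m in sess_m)
    ss_inter = n * sum((cell_m[g][s] - group_m[g] - sess_m[s] + grand) ** 2
                       for g in range(G) for s in range(S))
    ss_error = sum(
        (groups[g][i][s] - subj_m[g][i] - cell_m[g][s] + group_m[g]) ** 2
        for g in range(G) for i in range(n) for s in range(S))

    # Error terms: subjects within groups (between-subjects effects) and
    # session x subjects within groups (within-subjects effects).
    ms_subj = ss_subj / (G * (n - 1))
    ms_error = ss_error / (G * (n - 1) * (S - 1))
    return {
        "F_condition": (ss_cond / (G - 1)) / ms_subj,
        "F_session": (ss_sess / (S - 1)) / ms_error,
        "F_interaction": (ss_inter / ((G - 1) * (S - 1))) / ms_error,
    }


# Hypothetical responses per minute: [Session 1, Session 2] for each testee.
time_limit = [[4, 6], [5, 8], [6, 7]]
no_time_limit = [[1, 2], [2, 4], [3, 3]]
print(mixed_anova([time_limit, no_time_limit]))
```

With 15 testees per group and two sessions, as in this study, the same partition gives 1 and 28 degrees of freedom for each F ratio.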



To examine possible learning or practice effects, the variability of
item response latencies was examined across the testing sessions. The
standard deviations (biased) of a subject's item response latencies were
computed for the first and second testing sessions. These standard
deviations were treated as dependent variables in a repeated measurements
analysis of variance of the same type as those used for response rate and
response accuracy.
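
As a brief illustration of the variability measure, the sketch below computes the biased standard deviation (sum of squared deviations divided by N rather than N - 1) of one testee's item response latencies in each session. The function name and the latency values are hypothetical; `statistics.pstdev` from the Python standard library is used because it divides by N.

```python
from statistics import pstdev


def latency_variability(session_latencies):
    """Return the biased (divide-by-N) SD of latencies for each session."""
    return {session: pstdev(latencies)
            for session, latencies in session_latencies.items()}


# Hypothetical item response latencies (in seconds) for one testee.
testee = {
    "session_1": [2, 4, 4, 4, 5, 5, 7, 9],
    "session_2": [4, 5, 5, 5, 5, 6],
}
print(latency_variability(testee))
```

Each testee contributes one such value per session, and these values serve as the dependent variable in the analysis of variance.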


Results



Response rate. Table 6 shows the group means and standard deviations
for response rate under the time-limit and no-time-limit conditions. Also
shown are the results of the repeated measurements analysis of variance.
The mean response rate under the time-limit condition (8.93) was
significantly higher (p<.001) than the mean response rate under the
no-time-limit condition (6.60). The mean response rate during the second
testing session (8.25) was significantly higher (p<.001) than the mean
response rate during the first testing session (7.28).


Table 6
Means, Standard Deviations and Analysis of Variance
for Number of Responses per Minute

Testing Condition and
Testing Session                              Mean
Time-Limit Condition (N=15)
   Both sessions combined                    8.93
   Session 1
   Session 2
No-Time-Limit Condition (N=15)
   Both sessions combined                    6.60
   Session 1
   Session 2
Testing Session (Conditions Combined)
   Session 1                                 7.28
   Session 2                                 8.25

Analysis of Variance
Source of Variation                    df       MS       F       p

Between Subjects
   Testing condition                    1    81.11   13.20    .001
   Subjects within groups              28     6.14

Within Subjects
   Testing session                      1    14.09   19.60    .001
   Testing condition x
      testing session                   1     4.65    5.08    .031
   Testing session x
      subjects within groups


The data also show that in moving from the first to the second
testing session, the rate of response for the time-limit condition increased
more than the rate of response for the no-time-limit condition, since
there was a significant (p<.05) interaction between testing condition and
testing session. Figure 6 shows the pattern of this interaction.

Under the time-limit condition the correlation between Session 1 and
2 response rates was .87. Under the no-time-limit condition the
correlation between Session 1 and 2 response rates was .86.
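
For readers who wish to reproduce this kind of session-to-session correlation, a minimal sketch of the Pearson product-moment coefficient follows. The function name and the response rates are hypothetical and are not the study data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of cross-products and sums of squares about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical responses per minute for five testees in two sessions.
session_1 = [6.1, 7.4, 8.0, 9.2, 10.5]
session_2 = [7.0, 8.1, 9.4, 9.9, 12.0]
print(round(pearson_r(session_1, session_2), 2))
```

A high coefficient here means that testees keep roughly the same rank order of response rate from one session to the next, even when every testee's rate increases.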


Figure 6
Group Means for the Number of Responses per Minute
by Testing Condition and Testing Session


Table 7
Means, Standard Deviations and Analysis of Variance
for Proportion of Items Answered Correctly

Testing Condition and
Testing Session                              Mean    S.D.
Time-Limit Condition (N=15)
   Both sessions combined                     .58
   Session 1                                  .57
   Session 2                                  .58
No-Time-Limit Condition (N=15)
   Both sessions combined
   Session 1
   Session 2
Testing Session (Conditions Combined)
   Session 1
   Session 2

Analysis of Variance
Source of Variation                    df       MS       F       p

Between Subjects
   Testing condition                    1    .0040     .09    .770
   Subjects within groups              28    .0470

Within Subjects
   Testing session                      1    .0005     .36    .556
   Testing condition x
      testing session                   1    .0005     .36    .556
   Testing session x
      subjects within groups           28    .0013


Response accuracy. Table 7 shows the group means and standard
deviations for response accuracy under the time-limit and no-time-limit
conditions. Also shown are the results of the repeated measurements analysis
of variance. There were no significant main effects or interactions.
Under the time-limit condition the correlation between Session 1 and 2
response accuracies was .96. Under the no-time-limit condition the
correlation between Session 1 and 2 response accuracies was .94. Under
the time-limit condition the correlations between response rate and
response accuracy were .16 and -.07 for Sessions 1 and 2, respectively.
Under the no-time-limit condition the correlations between response rate
and response accuracy were .03 and -.11 for Sessions 1 and 2,
respectively. None of these last four correlations was significantly different
from zero.
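
The statement that none of the four rate-accuracy correlations differs from zero can be checked with the usual t test for a correlation coefficient, t = r * sqrt(n - 2) / sqrt(1 - r^2). The sketch below assumes that each correlation is based on the 15 testees in a condition and uses the two-tailed .05 critical value of t for 13 degrees of freedom (2.160); the function name is hypothetical.

```python
from math import sqrt

T_CRIT_DF13 = 2.160  # two-tailed .05 critical value of t with 13 df


def t_for_r(r, n):
    """t statistic for testing whether a correlation differs from zero."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)


# Rate-accuracy correlations reported in the text; n = 15 is assumed.
for r in (0.16, -0.07, 0.03, -0.11):
    t = t_for_r(r, 15)
    print(f"r = {r:+.2f}  t = {t:+.2f}  significant: {abs(t) > T_CRIT_DF13}")
```

Under these assumptions a correlation would need to reach about .51 in absolute value to be significant at the .05 level.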


Table 8
Means, Standard Deviations and Analysis of Variance
for Variability of Item Response Latencies

Testing Condition and
Testing Session                              Mean    S.D.
Time-Limit Condition (N=15)
   Both sessions combined                    3.59
   Session 1
   Session 2
No-Time-Limit Condition (N=15)
   Both sessions combined                    5.52
   Session 1
   Session 2
Testing Session (Conditions Combined)
   Session 1
   Session 2

Analysis of Variance
Source of Variation                    df       MS       F       p

Between Subjects
   Testing condition
   Subjects within groups              28

Within Subjects
   Testing session                      1
   Testing condition x
      testing session                   1
   Testing session x
      subjects within groups           28      .83




Item response latencies. Table 8 shows the group means and standard
deviations for the variability of item response latencies under the
time-limit and no-time-limit conditions for the two testing sessions.
Also shown are the results of the repeated measurements analysis of
variance for that variable. The average variability of item response
latencies was significantly smaller (p<.01) under the time-limit
condition (3.59) than under the no-time-limit condition (5.52). There was no
main effect for testing sessions, but there was a significant interaction
between testing conditions and testing sessions (p<.05). Figure 7 shows
the pattern of this interaction. Under the time-limit condition the
variability of item response latencies decreased when moving from the
first to the second testing session. Under the no-time-limit condition
the variability of item response latencies increased when moving from
the first to the second testing session.


Figure 7 
Group Means for the Variability of Item Response Latencies
by Testing Condition and Testing Session


Conclusions. The results of Study 3 were consistent with the
results of the two previous studies. Higher response rates were observed
under the time-limit condition than under the no-time-limit condition,
while response accuracy remained consistent across testing conditions.
The analysis of response latency data implies that testees may learn
different test-taking strategies in responding to time-limit versus
no-time-limit conditions.




Summary and Implications 




Although there are wide individual differences in response rates and
response accuracies under both time-limit and no-time-limit conditions,
the results of the three studies may be summarized as follows:


Summary 
As expected, given the apparent differences between the
testing conditions, time-limit response rates were higher 
than no-time-limit response rates. 


Response accuracy was consistent across time-limit and 
no-time-limit conditions. Individuals can change their 
response rates without affecting their accuracy. 


Most individuals showed increases in response rate during
the course of a time-limit testing session, whereas
consistent response rates were typical of no-time-limit
testing sessions.


When the same persons received both time-limit and
no-time-limit conditions, there were high positive
correlations between time-limit response rates and no-time-limit
response rates.


There were high positive correlations between time-limit 
and no-time-limit response accuracies under the two experi- 
mental conditions. 


Under both time-limit and no-time-limit conditions,
correlations between response rates and response accuracies
were effectively zero. The only indication of a
relationship between response rate and response accuracy was in
the first study, when testees obtaining high scores on a
no-time-limit pretest tended to have higher response rates
under both time-limit and no-time-limit conditions.


There was a correlation between item difficulty and item 
response latency: individuals take more time in responding 
to the more difficult items. Under a time-limit condition, 
however, individuals take proportionately less time in 
responding to the more difficult items. 


Also reflecting the relationship between item difficulty
and response latency, response latencies for incorrectly
answered items were longer than response latencies for
correctly answered items. But the differences between the
mean latencies for correct and incorrect responses were less
under a time-limit condition.


When two short testing sessions were given in succession 
under the same administrative conditions, the mean response 
rate during the second session was higher than the mean 
rate during the first session. This was especially true 
when tests were administered under a time-limit condition. 

When two short time-limit testing sessions were given in
succession, the variability in item response latencies
decreased when moving from the first to the second testing
session. But when two short no-time-limit testing sessions
were given in succession, the variability in item response
latencies increased when moving from the first to the second
testing session.


The three studies showed similar results with regard to response
accuracy. Given an item arrangement that ensures consistent average
item difficulty throughout the test, there were no significant differences
in response accuracy for time-limit versus no-time-limit conditions.
Thus it might appear that response accuracy, or proportion of items
answered correctly, is an adequate index of ability level under both
time-limit and no-time-limit modes of administration. However, considered
in relation to the findings on response latencies and item difficulties
in both Study 1 and Study 2, this interpretation is suspect under
time-limit conditions of administration. These two studies showed that under
time-limit conditions of administration, students spent proportionately
less time on items of higher difficulty or on items answered incorrectly.
Since accuracy scores were not affected by the imposition of time limits,
these data imply a test-taking strategy designed to maximize the number
of correct responses. To implement this strategy, students apparently
guess quickly on difficult-appearing items rather than spend time trying
to determine the correct answer. By using this strategy they are able to
maintain the same accuracy levels while increasing their response rates.

Consequently, if a "maximum performance" conception of ability
level is adopted, scores from time-limit tests might not yield accurate
indications of the highest level of item difficulty that a given testee
is able to answer correctly. This would result from the fact that
testees under time-limit conditions would not spend sufficient time
attempting to solve difficult items, which they might be capable of
solving, in an attempt to maximize the number of items answered correctly.
This is less likely to occur under no-time-limit conditions, where
testees appear to deliberate more on their responses to difficult items.
Thus, under no-time-limit conditions they are more likely to obtain
the correct answer on the more difficult items and to exhibit the
maximum level of performance of which they are capable.


Unlike response accuracies, individual response rates were not
consistent across different testing conditions or across different
testing sessions. Higher response rates were observed under time-limit
conditions than under no-time-limit conditions. This could be due
to the different test-taking strategies used by the students.


The test response records obtained in Study 1 indicate that there
is intra-individual variability in response rates within a single
testing session. This intra-individual variability can be interpreted
as different modes of adaptation to time-limit or no-time-limit testing.
The response rates for most individuals in Study 2 varied when moving
from a time-limit to a no-time-limit condition or from a no-time-limit
to a time-limit condition. Given these results, it is interesting to
note that in Study 2 response rates under a time-limit condition correlated
highly with response rates under a no-time-limit condition. Although
individuals may change their response rate under different testing
conditions, they do so in predictable ways. Response accuracies also
were generally predictable between testing conditions, with slightly
lower correlations when no-time-limit tests were administered first.
Both Studies 2 and 3 showed that there was essentially a zero
correlation between response rate and response accuracy.


Implications 


The results of these studies have no direct bearing on whether
time-limit or no-time-limit testing procedures provide better measurement.
However, the results do have some implications for the utility of
scores derived from time-limit tests. The typical time-limit test
score is the number of items that are answered correctly within the
time limit. All other things being equal, time-limit test scores
will be higher for testees who work faster. But testees who adopt the
test-taking strategy illustrated by the results of these studies will
obtain higher number-correct scores than others, working equally fast,
who do not adopt that strategy. The strategy, as shown by the data of
Studies 1 and 2, involves spending proportionately less time on
difficult test items, permitting the testee to encounter and answer
correctly more easy items.
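
The dependence of a time-limit score on response rate can be made concrete with a small arithmetic sketch. The number of items answered correctly equals the number attempted (rate x time) multiplied by the proportion answered correctly. The function name and the 10-minute limit below are hypothetical; the rate (8.93 responses per minute) and accuracy (.58) are the time-limit group means reported for Study 3.

```python
def number_correct(rate_per_min, accuracy, minutes):
    """Number-correct score: items attempted times proportion correct."""
    items_attempted = rate_per_min * minutes
    return items_attempted * accuracy


TIME_LIMIT_MIN = 10  # hypothetical time limit, not taken from the studies

# Study 3 time-limit group means for rate and accuracy.
print(number_correct(8.93, 0.58, TIME_LIMIT_MIN))

# Two hypothetical testees with equal accuracy but different rates.
print(number_correct(10.0, 0.50, TIME_LIMIT_MIN))
print(number_correct(8.0, 0.50, TIME_LIMIT_MIN))
```

With accuracy held equal, the faster testee obtains the higher number-correct score, which is why a time-limit score mixes response rate with response accuracy.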


Thus, number-correct scores on time-limit tests include at least
four components: 1) overall response accuracy; 2) overall response
rate; 3) an intra-individual component due to test-taking strategy;
and 4) an intra-individual component due to test-taking style, or mode
of adaptation. The first two of these components are uncorrelated.
The relative contribution of the other two to time-limit scores is
unknown at present. The conglomeration of these variables into a
single test score is likely to reduce its relationship with actual
ability. On the other hand, number-correct scores on no-time-limit
tests are more likely to be a function solely of response accuracy.


The results of Study 3 provide further evidence of problems in
the use of number-correct scores in time-limit tests. Given two
equivalent item pools administered under identical instructions,
testees were able to increase their response rates considerably when
moving from one testing session to the next. These results indicate
that response rate, rather than being completely determined by stable
personality variables, and rather than being just another indication
of ability level, is a test-taking style or strategy that is amenable
to learning or practice effects.


Despite all that is known about time-limit and no-time-limit
test scores, about their relationships to one another, and about their
reliability and validity, surprisingly little is known about the
behavior that yields test scores. Test scores are a function of
test-taking behavior. Progress in improving the reliability and validity
of ability test scores could result from further studies that take an
experimental approach to psychometrics. Further study is needed
into the areas of test-taking styles, test-taking strategies, and
test-taking behaviors. Computer administration of ability test items
makes it possible to implement an experimental approach to understanding
the determinants of ability test responses.

